A New Model-Based Mandarin-Speech Coding System
Authors
Abstract
In this paper, a new model-based Mandarin-speech coding system is proposed. In the encoder, it employs a prosody-enriched ASR with a hierarchical prosodic model (HPM) to generate enriched transcriptions of the input speech, comprising linguistic features, prosodic tags, and spectral parameters. These features are sent to the decoder, which first uses the HPM to reconstruct four prosodic-acoustic features (syllable pitch contour, syllable duration, syllable energy level, and intersyllable pause duration) from the linguistic features and prosodic tags, and then combines them with the spectral parameters to reconstruct the input speech signal with an HMM-based speech synthesizer. Experimental results show that the reconstructed speech has good quality at a low data rate of 543 bits/s.
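As a rough illustration of how a data rate such as the one quoted above is computed, the sketch below totals the bits sent per syllable for the three kinds of transmitted information and divides by the utterance duration. The class name, field names, and every bit width are hypothetical placeholders chosen for the example. They are not taken from the paper and do not reproduce its 543 bits/s figure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SyllableFrame:
    """Bits sent for one syllable. All widths here are illustrative assumptions."""
    linguistic_bits: int    # hypothetical: symbols for linguistic features
    prosodic_tag_bits: int  # hypothetical: symbols for prosodic tags
    spectral_bits: int      # hypothetical: quantized spectral parameters

    def total_bits(self) -> int:
        return self.linguistic_bits + self.prosodic_tag_bits + self.spectral_bits


def data_rate_bps(frames: List[SyllableFrame], duration_s: float) -> float:
    """Average data rate in bits per second over an utterance."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return sum(f.total_bits() for f in frames) / duration_s


# Made-up utterance: 4 syllables spoken in 1.0 s, 120 bits per syllable.
frames = [SyllableFrame(linguistic_bits=11, prosodic_tag_bits=9, spectral_bits=100)
          for _ in range(4)]
rate = data_rate_bps(frames, duration_s=1.0)
print(rate)
```

In a scheme of this kind the rate depends on the speaking rate (syllables per second) as well as on the per-syllable bit budget, so the same codebook sizes give different rates for fast and slow speech.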
Similar resources
A novel hybrid approach for Mandarin speech synthesis
The paper investigates a new method for solving concatenation problems in Mandarin speech synthesis, based on a hybrid of HMM-based speech synthesis and unit selection. Unlike other works, which use only boundary F0 errors as the concatenation cost, a CART-based F0 dependency model that considers a wide range of context information is trained to measure the smoothness of F0. Instead of phoneme-siz...
On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training
In this paper, we explore the use of attention-based models for online speech recognition without the usage of language models or searching. Our model is based on an attention-based neural network which directly emits English/Mandarin characters as outputs. The model jointly learns the pronunciation, acoustic and language model. We evaluate the model for online speech recognition on English and...
Incorporating Pitch Features for Tone Modeling in Automatic Recognition of Mandarin Chinese
Tone plays a fundamental role in Mandarin Chinese, as it plays a lexical role in determining the meanings of words in spoken Mandarin. For example, these two sentences R R (I like horses) and R M (I like to scold) differ only in the tone carried by the last syllable. Thus, the inclusion of tone-related information through analysis of pitch data should improve the performance of automatic speech...
Enhancement of hearing-impaired Mandarin speech
This paper presents a new voice conversion system that modifies the misarticulations and prosodic deviations of hearing-impaired Mandarin speech. The basic strategy is the detection and exploitation of characteristic features that distinguish the impaired speech from normal speech at the segmental and prosodic levels. For spectral conversion, cepstral coefficients were characterized under the fo...
An Analysis-by-Synthesis Study of Mandarin Speech Prosody
In the present paper, an analysis-by-synthesis study of Mandarin speech prosody is carried out. The Mandarin prosodic features are discussed from two salient perspectives: the function of prosody and the form of prosody. The symbolic representation of prosodic form with the INTSINT (INternational Transcription System for INTonation) system [1] reduces the surface complexity of a pr...
Publication date: 2011